Abstract  This paper considers flexible conditional (regression) measures
of market risk. Value-at-Risk modeling is cast in terms of the quantile
regression function - the inverse of the conditional distribution function. A
basic specification analysis relates its functional forms to the benchmark
models of returns and asset pricing. We stress important aspects of measuring
the extremal and intermediate conditional risk. An empirical application
characterizes the key economic determinants of various levels of conditional
risk.


*  We thank Takeshi Amemiya, Herman Bierens, Emily Gallagher, Roger
Koenker, Mary Ann Lawrence, Tom MaCurdy, Warren Huang, an anonymous
referee, and participants of seminars at Stanford University, University of
Mannheim, Midwest Finance Association Risk Session, CIRANO, International
Conference on Economic Applications of Quantile Regression for many useful
conversations and/or comments. Very special thanks to Bernd Fitzenberger who
provided extremely useful comments and corrections as an editor.

1  Introduction  and  Conclusion

Value-at-Risk (hereafter, VaR) is one of the most widely used measures of
market risk, employed in the financial industry for both internal control and
regulatory reporting. We explore various aspects of VaR modeling based on
median/quantile regressions (cf. Hogg (1975), Bassett and Koenker
(1978), Koenker and Bassett (1978)):

—  Conditional VaR modeling is cast in terms of the regression quantile
function - the inverse of the conditional distribution function. We relate
the functional forms of conditional quantiles to the basic statistical
models of returns and the models of asset pricing and arbitrage. The
conditional quantile models are semi-parametric in nature and are
considerably more flexible than most commonly used parametric methods
(section 2).

—  The key econometric aspects of conditional VaR measurement are
displayed (section 3). In particular, we address estimation, inference, and
specification analysis of high and intermediate conditional risk. A
fundamental problem in measuring high conditional risk is the lack of data
on high-risk events, which calls for the use of extreme value
theory (see Chernozhukov (1999a) for a theoretical account).

—  An extensive empirical section illustrates the methods. We estimate and
analyze the conditional market risk of an oil producer stock price as
a function of the key economic variables. We find that these variables
affect different quantiles of the return distribution in markedly different
and nontrivial ways. The key determinants of the extremal and
intermediate conditional risk are characterized. The market index (DJI) is
found to be the only statistically significant determinant of the extremal
risk. The other key variables may also exhibit very large effects. However,
the direction of these effects cannot be isolated due to the scarcity of data
on high-risk events.

We  hope  our  views  are  useful  to  the  reader.  We  also  recommend  other 
works  that  consider  regression  quantile  modeling  in  value-at-risk  and  related 
problems  in  finance:  Bassett  and  Chen  (1999),  Engle  and  Manganelli  (1999), 
Heiler  and  Abberger  (1999),  Taylor  (1999),  among  others. 

2  Modeling  Risk  Conditionally 

In  this  section  we  discuss  modeling  VaR  and  related  Market  Risk  measures 
(MRMs)  via  conditional  quantiles  and  other  techniques. 

2.1  Preliminaries

The  setting  we  consider  is  as  follows: 

—  $Y_t$ is the price process of the security or portfolio of securities.

—  $X_t$ is taken to be a "state process" or "information" vector. In practical
applications, $X_t$ usually consists of prices (or returns) of securities,
market indexes, interest rates, spreads, yields, and the like. We may allow
$X_t$ to grow with the length of the sampling period. Lagged values of $Y_t$
and functions of such lagged values (e.g. exponentially-weighted sample
volatility) may or may not be included in $X_t$.1

The Return of a portfolio with price process $Y_t$ over $[t, t+h)$ is

$$y_t^h = \ln Y_{t+h} - \ln Y_t.$$

The Conditional Value at Risk (VaR), $v_t^h(p)$, is defined as a level of
return $y_t^h$ over the period $[t, t+h)$ that is exceeded with probability $p$
($p \in (0,1)$):2

$$v_t^h(p) = \inf\{v : P_t(y_t^h \le v) \ge 1 - p\}.$$

Alternatively, this can be written in terms of the conditional distribution
function of $y_t^h$:

$$v_t^h(p) = F^{-1}_{y_t^h}(1 - p \mid X_t),$$

where $F^{-1}_{y_t^h}(\cdot \mid X_t)$ is the inverse of the cdf
$F_{y_t^h}(\cdot \mid X_t)$, the so-called conditional quantile function. We call
$p$ the confidence level and $\tau = 1 - p$ the index of VaR. The Conditional
VaR curve is the conditional VaR viewed as a function of the confidence level.
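
As a rough illustration of the two definitions above, the following sketch computes $h$-period log returns and an unconditional empirical analogue of $v_t^h(p)$, namely the smallest $v$ at which the empirical distribution function reaches $1 - p$. It ignores the conditioning on $X_t$ entirely; the function names and the price series are hypothetical and serve only as an example.

```python
import math


def log_returns(prices, h=1):
    """h-period log returns: ln Y_{t+h} - ln Y_t."""
    return [math.log(prices[t + h]) - math.log(prices[t])
            for t in range(len(prices) - h)]


def empirical_var(returns, p):
    """Smallest v with empirical P(y <= v) >= 1 - p (the 'inf' definition)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    ordered = sorted(returns)
    n = len(ordered)
    tau = 1.0 - p
    # Smallest 1-based rank k with k / n >= tau; the small offset guards
    # against floating-point round-off in tau * n.
    k = max(1, math.ceil(tau * n - 1e-12))
    return ordered[k - 1]


prices = [100.0, 101.0, 99.0, 102.0, 98.0, 103.0]  # hypothetical price path
rets = log_returns(prices)
var_80 = empirical_var(rets, p=0.80)  # index tau = 0.20
```

With five returns and $\tau = 0.2$, the estimate is simply the lowest return in the sample, which already hints at the data scarcity that arises for confidence levels close to 1.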

Similarly, the Extreme VaR may be defined as the maximum possible loss
over a period of time. In this regard, the extreme VaR may be introduced as
the limit form of the non-extreme VaR for confidence levels $p$ approaching
1:

$$v_t^h(1) = \lim_{p \uparrow 1} F^{-1}_{y_t^h}(1 - p \mid X_t) = \inf\{v : P_t(y_t^h \le v) > 0\}.$$

Correspondingly, extremal VaRs are VaR measures with $p$ close to 1.


1  Using the standard notation, a time subscript under the expectation
$E[\cdot]$, probability $P(\cdot)$, distribution function $F(\cdot)$, or density
$f(\cdot)$ denotes conditioning on the information set $X_t$.

2  This definition avoids ambiguity when the distribution of $y_t^h$ is atomic.
Otherwise, the definition is equivalent to
$v_t^h(p) = \{v : P_t(y_t^h \le v) = 1 - p\}$.

As noted, the superscript $h$ specifies the length of the time interval. (We
drop the superscript in the sequel to simplify notation.) In practical
applications, $h$ is typically chosen to be 2 weeks for regulatory reporting3
(which typically translates to 10 business days for local applications or
12 business days for around-the-globe trading applications, such as Forex
desks). 1-day intervals are used for the quality assessment of banks' VaR
models by regulators. They are also commonly used for internal risk
management, although other values of $h$ (up to one month) are not uncommon.
There are two primary reasons why we are not interested in measuring
market risk for time periods longer than one month: (1) market risk events
typically happen during short intervals, and (2) losses due to market risk
events can be restored relatively easily in short periods of time (reduction
of balance sheet, refreshment of capital, etc.), especially if the market risk
event is not accompanied by liquidity or systemic risk events. This
contrasts with measures of credit risk, which have to be estimated over
the instruments' lifetime (which could be as long as 10 to 30 years).


2.2  Market  Risk  Measures  and  Conditioning 

MRMs,  as  statistical  estimators,  can  be  parametric,  semi-parametric,  and 
non-parametric,  depending  on  the  strength  of  identifiability  assumptions 
made  by  the  methods.4  The  most  commonly  used  parametric  MRM  is  the 
variance-covariance  method  that  assumes  (conditionally)  normal  returns 
(implemented,  for  example,  in  J. P.  Morgan's  RiskMetrics).  The  conditional 
quantile  methods  with  parametric  and  non-parametric  functional  forms  fall 
into  the  semi-parametric  and  the  non-parametric  classes,  respectively. 


3  See  FedReg  (1996)  for  the  review  of  the  1996  Risk  Amendment  to  the  Basle 
Capital  Accord  of  1988. 

4  Another fundamentally different method is stress testing, in which one
defines a "basis of probable events" (such as various parallel shifts of the
Treasury yield curve, changes of the yield curve slope, and others used in the
DPG guidelines), "highly unlikely events" (such as a drop by 25% in the S&P500)
and "structurally impossible events" (such as a drop by 25% in the S&P500
accompanied by a large increase in DJI) and examines the behavior of the
portfolio under various such events or shocks. MRMs of the last type may offer
valuable insight about the market-risk properties of a security or a portfolio.
The methods are less statistical and more experimentation-based in their
nature, yet they could be usefully combined with statistical methods,
especially when data is not informative about the extremal events.

The  type  of  conditioning  is  another  vital  dimension  along  which  MRMs
can  be  classified.  The  following  two  types  are  not  mutually  exclusive,  yet 
the  proposed  distinction  will  be  useful. 

(a)  Moving  Window  Sampling  -  Regime  Conditioning 

Many  of  the  historical  and  parametric  MRMs  are  based  on  the  samples 
of  certain  (often  moving)  width  (e.g.  a  window  of  the  last  100  returns). 
The  sample  is  then  weighted  or  not  weighted  to  compute  the  parameter 
estimates.  For  example,  the  "historical  volatility"  approaches5  use  geometric 
weighting.  In  this  case,  such  measures  may  be  seen  as  GARCH  models 
that  have  recursive  conditional  variances.6  Thus  such  methods  could  be 
interpreted  as  the  regression  measures  described  next. 
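
A minimal sketch of the geometric weighting mentioned above, assuming the recursion $s_t^2 = \lambda s_{t-1}^2 + (1 - \lambda) y_t^2$ for the variance estimate. The decay value 0.94 and the initialization at the first squared return are assumptions made for illustration, not settings taken from this paper.

```python
def ewma_variance(returns, decay=0.94):
    """Geometrically weighted variance path.

    Recursion: s2_t = decay * s2_{t-1} + (1 - decay) * y_t ** 2.
    Unrolling it puts weight (1 - decay) * decay ** j on the squared
    return observed j periods ago, i.e. geometric weights.
    """
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie in (0, 1)")
    s2 = returns[0] ** 2  # starting value: an assumption of this sketch
    path = [s2]
    for y in returns[1:]:
        s2 = decay * s2 + (1.0 - decay) * y * y
        path.append(s2)
    return path


variances = ewma_variance([0.01, -0.02, 0.015])  # hypothetical returns
```

The recursive form is what makes the link to GARCH-type conditional variances visible: the current estimate depends on the past only through the previous estimate and the latest squared return.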

Oftentimes,  however,  no  weighting  is  done  and  no  interpretation  of  the 
above  kind  is  provided.  What  is  achieved  by  such  a  form  of  conditioning? 
A  primary  goal  is  to  robustify  the  MRM  against  the  structural  changes 
or  regime  switches.  That  is,  the  whole  history  may  not  be  an  appropriate 
sample  for  measurement/estimation  due  to  structural  changes  that  have 
occurred  in  the  past.  Indeed,  the  dynamics  of  oil  prices  in  the  70s  is  probably 
very  different  from  that  now.  Therefore,  considering  the  equally  weighted 
sample  of  moving  width  is,  most  of  all,  a  method  for  disregarding  un-  or 
dis-  informative  data.  Thus,  such  procedures  could  be  seen  primarily  as 
methods  of  conditioning  on  the  environment  and  as  a  way  to  guard  against 
misspecification  w.r.t.  a  given  historical  period. 

(b)  Regression  -  Conditioning  on  the  information  Xt 

In addition to considering the informative data, a risk-modeler seeks
to produce a measure of risk conditionally on the set of the key economic
("state") variables $X_t$. For example, one may wish to characterize the
volatility of the oil return as a function of the oil spot price, key exchange
rates, etc. Such a form of conditioning is a regression characterization of
risk. Regression seeks to describe the moments of return (generally,
distribution or quantiles) as functions of $X_t$. The sample analogues of
regression dependencies are regression estimators. Examples of MRMs of this
kind include frequently used parametric methods, such as the normal GARCH,
and the class of quantile regression MRMs treated in this paper. Indeed,
GARCH represents volatility as a function of $X_t$, the regression quantile


5  Implemented  in  RiskMetrics. 

6  Standard  GARCH  models  also  assign  geometric  weights  to  the  squares  of  past 
innovations,  and  the  "moving  window"  results  from  truncating  the  sum  once  the 
weights  are  sufficiently  small. 
method describes the quantiles of the return distribution of $y_t \mid X_t$.
Note that most non-parametric or "historical" methods, such as unconditional
quantile or moment estimation, are not regression methods in that the
regression dependence is simply not modeled. An example of a non-parametric
method that gives a regression measure is a non-parametric estimator of the
conditional density function (from which a conditional VaR estimate can be
computed - see Ait-Sahalia and Lo (1998)).7

2.3  Quantile  Regression 

QR  flexibly  allows  us  to  directly  model  the  conditional  VaR,  utilizing  only 
the  pertinent  information  that  determines  quantiles  of  interest.  This  is  in 
sharp  contrast  with  the  traditional  methods  that  use  information  on  the 
central  moments  of  conditional  distribution  -  mean,  variance,  kurtosis,  etc. 
-  to  construct  the  VaR  estimates.  This  primary  feature  of  QR  is  especially 
important  for  modeling  intermediate  and  extremal  conditional  VaR. 

Envisioning the essence of conditional quantile modeling is quite
easy. Fix any $p$, a confidence level. Assume some functional dependence of
VaR on the information variables $X_t$:

$$v_t(p) = F^{-1}_{y_t}(\tau \mid X_t) = m_p(X_t, \beta(p)).$$

For any $p$ (or a set of $p$'s) we could suitably pick one or the other form of
$m_p$ to match the observed historical data sufficiently well through a set of
sample moment restrictions.8 This matching is implemented by means
of quantile regression and related methods (see section 4). In this paper we
confine our attention exclusively to the linear (polynomial) analysis.


3    A  Basic  Specification  Analysis 

Linear  (Polynomial)  Models 


7  To name a few disadvantages of the fully non-parametric approach: (1) it is
computationally demanding, and (2) due to model complexity, it often entails a
significant loss of predictive ability in moderate-sized samples and cases of
many conditioning variables.

8  In fact, the choice of $m_p$ could be arbitrarily flexible - non-parametric.
For example, $m_p$ can be modeled through a composition of basis functions of a
predetermined class.

A fundamental model is the linear model of the conditional quantile function
(VaR):

$$v_t(p) = F^{-1}_{y_t}(\tau \mid X_t) = X_t'\beta(p), \qquad (3.1)$$

where $X_t$ is a $d$-dimensional vector representing any desired transforms
(powers, etc.) of $X_t$, including a constant. We next discuss both the
plausibility and restrictiveness of such a model.

(a)  Location-Scale Models

Linear model (3.1) naturally arises, for example, from an AR formulation
of the return dynamics. Indeed, consider a simple linear location-scale
model:

$$y_t = X_t'\alpha + X_t'\lambda\, u_t, \qquad (3.2)$$
$$u_t \text{ is independent of } X_t, \quad P(u_t \le 0) = 1/2.$$

Model (3.2) is a location-scale model, where both location $X_t'\alpha$ and scale
$X_t'\lambda > 0$ are parameterized as linear functions. Furthermore, location
$X_t'\alpha$ is the conditional median function. We should also impose scale
restrictions to identify $\lambda$, e.g. $E|u_t| = 1$ or, alternatively,
$\|\lambda\| = 1$. Since models like (3.2) are not used in the sequel, we omit any
further discussion of identification.
We can define the "shock" term $u_t$ in a (perhaps) more familiar way:

$$y_t = X_t'\alpha + X_t'\lambda\, u_t, \qquad (3.3)$$
$$u_t \text{ is independent of } X_t, \quad E(u_t) = 0, \quad E(u_t^2) = 1.$$

In this case, location $X_t'\alpha$ is the conditional mean function, and
$X_t'\lambda > 0$ is the scale ($(X_t'\lambda)^2$ is the conditional variance).
Note that models such as (3.3) are directly justifiable from the standard APT
and other factor models (see Campbell, Lo, and MacKinlay (1997)).

Either model generates a linear conditional quantile model. Denote by $F_u$
the distribution function of $u_t$. Since $X_t'\lambda > 0$ and $u_t$ is
independent of $X_t$, clearly

$$v_t(p) = X_t'\alpha + X_t'\lambda\, F_u^{-1}(\tau) = X_t'\beta(p), \qquad \beta(p) = \alpha + \lambda F_u^{-1}(\tau).$$
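
The linearity claim can be checked numerically. The sketch below simulates a location-scale model of the form (3.3) with standard normal shocks and counts how often the return falls at or below $X_t'\beta$ with $\beta = \alpha + \lambda F_u^{-1}(\tau)$; if the quantile function is indeed linear with these coefficients, the frequency should be close to $\tau$. The coefficient values, the single uniform regressor, and the normal shocks are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def simulate_hit_rate(tau, n=20000, seed=0):
    """Share of simulated returns at or below X'beta(tau)."""
    rng = random.Random(seed)
    alpha = (0.0, 0.5)   # hypothetical location coefficients
    lam = (1.0, 0.3)     # hypothetical scale coefficients (X'lam > 0 below)
    q_u = NormalDist().inv_cdf(tau)  # F_u^{-1}(tau) for u ~ N(0, 1)
    beta = tuple(a + l * q_u for a, l in zip(alpha, lam))
    hits = 0
    for _ in range(n):
        x = (1.0, rng.uniform(0.0, 2.0))  # constant plus one regressor
        location = sum(a * xi for a, xi in zip(alpha, x))
        scale = sum(l * xi for l, xi in zip(lam, x))
        y = location + scale * rng.gauss(0.0, 1.0)
        hits += y <= sum(b * xi for b, xi in zip(beta, x))
    return hits / n


rate = simulate_hit_rate(0.05)  # should be close to 0.05
```

Note that every component of $\beta(\tau)$ moves with $\tau$ only through the scalar $F_u^{-1}(\tau)$, which is the restriction that the general linear model (3.1) does not impose.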

(b)  Non-Location-Scale Models

While a linear location-scale model implies a linear VaR/quantile function,
the converse may not be true. Indeed, it is easy to see that the
location-scale model necessarily involves coefficients $\beta(p)$ that are
monotone in the quantile index, whereas (3.1) imposes no such assumption. In
either model (3.2) or (3.3), it is assumed that $u_t$ is independent of $X_t$.
In general $u_t$ will not be independent of $X_t$, and all such cases form the
class of non-location-scale models. It is not difficult to see how these cases
can generate the linear/polynomial forms.

Polynomial  Models  as  Approximations  of  Non-linear  Models

More generally, the linear/polynomial specifications can be considered as
approximations of nonlinear models of the form
$y_t = \mu(X_t) + \sigma(X_t) u_t$. Since it is clear how this approximation
works, we omit any further discussion.

Recursive  Specifications 

Models that we have considered so far represent VaR as a function of
only a few key economic variables and their transforms, $X_t$. It may be
desirable to consider models that reflect the whole past information
$\mathcal{X}_t$. Nonlinear dynamic models allow for a wide variety of such
specifications. For example, let $X_t$ be the subset of information variables
(or their transforms) that become available in the $t$-th period and
$\mathcal{X}_{t-1}$ be the variables $\{X_j\}_{j=0}^{t-1}$, so that
$\mathcal{X}_t = (X_t, \mathcal{X}_{t-1})$. A general recursive specification
can then be obtained as follows:

$$v_t(p) = X_t'a(p) + b(p)\, f_1(\mathcal{X}_t, p),$$
$$f_1(\mathcal{X}_t, p) = c(p)\, f_1(\mathcal{X}_{t-1}, p) + f_2(X_t, d(p)). \qquad (3.4)$$

Linear forms in (3.4) can be replaced by nonlinear forms. Restrictions
guaranteeing stability of the model must be imposed on $a, b, c, f_1, f_2$.
Models of this sort were considered in Weiss (1991), Koenker and Zhao (1996),
Engle and Manganelli (1999). Models like (3.4) are useful as parsimonious
regressions that represent value-at-risk/quantiles as a function of all past
information. In contrast, the simpler Markovian specifications posit that most
of the relevant information is contained in a few past lags of the key
variables. The model (3.4) can typically be solved recursively to eliminate
$f_1$, yielding nonlinear dynamic functional forms, which can be used in the
usual nonlinear estimation. One example involves the model

$$f_1(\mathcal{X}_t, p) = \sigma(y_t - \mu_t \mid \mathcal{X}_t), \qquad (3.5)$$

where $\sigma^2(y_t - \mu_t \mid \mathcal{X}_t)$ is the conditional variance of
the de-meaned return. It can, for example, take the form of a GARCH model (see
Koenker and Zhao (1996) for discussion). In practice, a simple strategy to
accommodate a model like (3.4)-(3.5) into a linear framework is to first
estimate $\sigma(y_t - \mu_t \mid \mathcal{X}_t)$ via a GARCH model and use it
as a regressor in the earlier linear model, with the augmented regressor vector
$(X_t', \sigma(y_t - \mu_t \mid \mathcal{X}_t))'$. Of course, the fact that the
regressor is estimated can be accommodated in the inference analysis or,
alternatively, can be ignored for all practical purposes. In the empirical
section we will not make use of the recursivity, since we are more interested
in the economic determinants of risk, but most of the techniques discussed next
apply both to the linear and nonlinear specifications.
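
To make the recursion in (3.4) concrete, here is a small sketch for a scalar information variable and a fixed confidence level. The choice $f_2(x, d) = d\,|x|$, the zero starting value, and the stability condition $|c| < 1$ are assumptions of this illustration; the paper leaves these forms general.

```python
def recursive_var_path(xs, a, b, c, d, f1_init=0.0):
    """Iterate f1_t = c * f1_{t-1} + f2(x_t, d) and v_t = a * x_t + b * f1_t.

    All coefficients are scalars here and correspond to a(p), b(p), c(p),
    d(p) at one fixed p.
    """
    if abs(c) >= 1.0:
        raise ValueError("this sketch assumes |c| < 1 for stability")
    f1 = f1_init
    path = []
    for x in xs:
        f1 = c * f1 + d * abs(x)  # hypothetical choice f2(x, d) = d * |x|
        path.append(a * x + b * f1)
    return path


var_path = recursive_var_path([1.0, -2.0], a=0.1, b=-1.0, c=0.5, d=1.0)
```

Because $f_1$ carries forward a geometrically discounted sum of past $f_2$ terms, a handful of coefficients summarizes the whole history, which is the parsimony argument made above.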

In summary, it is important to stress that the conditional quantile model
imposes no strong assumptions about the distribution function of the
underlying error term. This underscores the semi-parametric, flexible nature
of the conditional quantile models, which could be valuable for modeling
market risk.


4  Estimation/Inference 

This section reviews some recent results on estimation and inference.
Section 4.1 reviews regular asymptotic estimation and inference in the quantile
regression literature. Section 4.2 reviews results of Chernozhukov (1999a)
on estimating high and low (extremal) regression quantiles.

4.1  Estimation and Inference

The Conditional VaR function, $v_t(p)$, is parameterized as
$m(X_t, \beta(\tau), \tau)$, which, in our empirical setting, takes the
linear-quadratic form $X_t'\beta(\tau)$. Hence we will discuss this case only
(to keep notation simple). The nonlinear case is similar - just replace $X_t$
by the derivative
$\partial m(X_t, \beta, \tau)/\partial\beta \,|_{\beta = \beta(\tau)}$, and
$X_t'\beta$ by $m(X_t, \beta, \tau)$.

The sample analogue of the moment conditions that define the quantile
function, integrated with respect to $\beta(\tau)$, yields the quantile
regression objective function $Q_T$ (Koenker and Bassett, 1978).
$\hat\beta(\tau)$ is defined as the argmin of $Q_T$:

$$\hat\beta(\tau) = \arg\min_{\beta} \Big[ Q_T(\beta, \tau) = \sum_t \rho_\tau(y_t - X_t'\beta) \Big], \qquad (4.6)$$

where $\rho_\tau(x) = \tau x^+ + (1 - \tau) x^-$, with $x^+ = \max(x, 0)$ and
$x^- = \max(-x, 0)$. For a given $\tau$,
$\sqrt{T}(\hat\beta(\tau) - \beta(\tau))$ is asymptotically normal under
general dependence and heterogeneity,9 and, furthermore, the regression VaR
coefficients converge to a Gaussian process $G(\cdot)$,10 as functions of
$\tau$:11

$$\sqrt{T}(\hat\beta(\cdot) - \beta(\cdot)) \Rightarrow G(\cdot). \qquad (4.7)$$
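
The objective (4.6) is easiest to inspect in the intercept-only case, where its minimizer is a sample quantile. The sketch below uses the convention $\rho_\tau(x) = \tau \max(x, 0) + (1 - \tau) \max(-x, 0)$ and searches over the data points, among which a minimizer always exists. The data and function names are hypothetical; a full regression problem would normally be solved by linear programming, which is not shown here.

```python
def check_loss(x, tau):
    """rho_tau(x): weight tau on positive residuals, 1 - tau on negative ones."""
    return tau * x if x >= 0 else (tau - 1.0) * x


def objective(b, ys, tau):
    """Q_T(b, tau) for a model with an intercept only."""
    return sum(check_loss(y - b, tau) for y in ys)


def intercept_only_quantile(ys, tau):
    """Minimize the objective over the observed values."""
    return min(sorted(ys), key=lambda b: objective(b, ys, tau))


ys = [-3.0, -1.0, 0.0, 2.0, 5.0, 7.0, 8.0, 9.0, 10.0, 12.0]  # hypothetical
b_hat = intercept_only_quantile(ys, tau=0.25)
```

Here the minimizer is the third smallest observation: with ten points, it is the smallest value at which the empirical distribution function reaches 0.25.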


9  See e.g. Portnoy (1991), Fitzenberger (1998), Weiss (1991) (nonlinear cases),
Koenker and Zhao (1996).

10  See Portnoy (1991) and also the results in Chernozhukov (1999b) that allow
for non-linear specifications and various forms of dependent data.

11  Here $\Rightarrow$ denotes weak convergence in $\ell^\infty$ - see e.g. van
der Vaart and Wellner (1996). $G(\cdot) = J^{-1}(\cdot)\, G_P f(W, \cdot)$,
$f(W, \tau) = (1(Y \le X'\beta(\tau)) - \tau)\, X$, and $W = (X, Y)$.
$G_P(f(W, \tau))$ is a zero-mean random Gaussian function of $\tau$ $(= 1 - p)$,
Inference is facilitated by estimating appropriate variance-covariance
matrices by a moving-block bootstrap (Politis and Romano (1994), Fitzenberger
(1998)).
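
A minimal sketch of the moving-block idea, under simplifying assumptions: overlapping blocks of consecutive observations are drawn with replacement and concatenated until the original sample length is reached, so that short-range dependence within blocks is preserved. The scalar statistic, the block length, and the number of replications are hypothetical choices; the variance matrices used in the paper would be obtained analogously from vector-valued statistics.

```python
import random


def moving_block_indices(n, block_len, rng):
    """Indices of a resample built from overlapping blocks of length block_len."""
    if not 1 <= block_len <= n:
        raise ValueError("block_len must lie between 1 and n")
    idx = []
    while len(idx) < n:
        start = rng.randrange(n - block_len + 1)  # block start, uniform
        idx.extend(range(start, start + block_len))
    return idx[:n]  # trim the last block to keep the sample size


def bootstrap_variance(series, stat, block_len, n_boot=500, seed=0):
    """Bootstrap variance of a scalar statistic of a dependent series."""
    rng = random.Random(seed)
    n = len(series)
    draws = []
    for _ in range(n_boot):
        idx = moving_block_indices(n, block_len, rng)
        draws.append(stat([series[i] for i in idx]))
    mean = sum(draws) / n_boot
    return sum((d - mean) ** 2 for d in draws) / (n_boot - 1)
```

A typical call would be `bootstrap_variance(returns, stat, block_len=5)`, where `stat` maps a resampled series to the quantity of interest, for example a sample quantile.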

Test  Processes  and  their  Functionals 

Testing is discussed in detail in e.g. Koenker and Portnoy (1999) and
Weiss (1991). Koenker and Machado (1999) and also Chernozhukov (1999b)
study several forms of tests and test processes (test statistics viewed as
functions of $\tau$). These test processes are variants of Wald, Score,
quasi-LR, and specification test statistics, viewed as functions of $\tau$ in
an interval $V$ (e.g. $V = [.1, .9]$). The quasi-LR, Wald, and rank-score test
processes are introduced and studied in Koenker and Machado (1999) in the
context of independent data and linear functional forms. These test processes,
as well as some of their alternatives, are studied in Chernozhukov (1999b)
under conditions of dependence and nonlinearities. Test processes are shown
to be asymptotically distributed as quadratic forms of Gaussian processes
(P-Bessel Processes), and each coordinate of the test statistics is
asymptotically distributed as chi-squared. Bootstrap inference is also
discussed there. The specification test process is introduced in Chernozhukov
(1999b). The coordinates $S_c(\tau)$ of the specification test process
$S_c(\cdot)$ are defined as quadratic forms of the usual specification test
statistic (such as GMM-like overidentification statistics or statistics like
those in Bierens and Ginther (1999)). A simple, practical version defines
$S_c(\tau)$ as a quadratic form of
$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} [(1(y_t \le X_t'\hat\beta(\tau)) - \tau) Z_t]$,
where the $Z_t$'s are functions of $X_t$ or other information variables (other
than $X_t$). The specification test process converges weakly to a generalized
P-Bessel process. By selecting various forms of $Z_t$ one can check:

—  conditionality  -  ability  to  incorporate  all  relevant  past  information  (with 
Zt  equal  to  past  information  variables) 

—  functional  form  validity    (with  Zt  equal  to  various  transforms  of  Xt). 


exactly a P-Brownian Bridge - cf. van der Vaart and Wellner (1996), whose
distribution is defined by the finite-dimensional normal distributions and the
covariance kernel
$$CV(f(W, \tau_i), f(W, \tau_j)) = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \Big\{ E[f(W_t, \tau_i) f(W_t, \tau_j)'] + \sum_{k \ge 1} E[f(W_{t+k}, \tau_i) f(W_t, \tau_j)' + f(W_t, \tau_i) f(W_{t+k}, \tau_j)'] \Big\};$$
and finally, $J(\tau)$ is a fixed non-stochastic invertible matrix,
$J(\tau) = \lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} E[f_{y_t}(F^{-1}_{y_t}(\tau \mid X_t) \mid X_t)\, X_t X_t']$.
Note that (4.7) is a very succinct statement, conveniently characterizing the
distribution. For example,
$(\sqrt{T}(\hat\beta(\tau_j) - \beta(\tau_j)),\ j = 1, \dots, l)$
converges in distribution to $N(0, A)$, where the $ij$-th block
$A_{ij} = A_{ji}'$ of matrix $A$ is given by
$J^{-1}(\tau_i)\, CV(f(W, \tau_i), f(W, \tau_j))\, J^{-1}(\tau_j)'$.

Various functionals of these processes form test statistics that enable
simultaneous tests on all $\tau \in V$. For example,
$\sup_{\tau \in V} S_c(\tau)$ forms a global specification test of the
Kolmogorov-Smirnov type. It examines the validity of the functional form of
the conditional quantile function for all $\tau$ in $V$ simultaneously.12

Simpler versions of the (pointwise) conditionality tests (that disregard
that $\hat\beta(\tau)$ is an estimated quantity) can be constructed by
regressing $(1(y_t \le X_t'\hat\beta(\tau)) - \tau)$ on the lagged values of
itself and other information variables, and then checking if the regression
coefficients are zero. These and other tests are suggested for evaluating VaR
models by Lopez (1998), Christoffersen (1998), Crnkovic and Drachman (1996),
Diebold, Gunther, and Tay (1998), Engle and Manganelli (1999), among others.
Such procedures offer a simple way to quickly check whether a given VaR model
is plausible. Lopez (1998) offers a valuable detailed discussion of the
criteria with which VaR models can be evaluated.
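
The simple conditionality check described above can be sketched as follows: form the "hit" series $1(y_t \le \hat v_t) - \tau$ and regress it on its own first lag. Only the slope is computed here; standard errors and additional regressors are omitted, and the data are an artificial example in which violations recur every fourth period, so the hits are predictable from their past even though their average is exactly zero.

```python
def hit_series(returns, var_forecasts, tau):
    """Hit_t = 1(y_t <= VaR_t) - tau; mean zero under a correct model."""
    return [(1.0 if y <= v else 0.0) - tau
            for y, v in zip(returns, var_forecasts)]


def ols_slope(x, y):
    """Slope of a least-squares regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def lag_one_hit_slope(returns, var_forecasts, tau):
    """Regress the hit series on its first lag and return the slope."""
    h = hit_series(returns, var_forecasts, tau)
    return ols_slope(h[:-1], h[1:])


returns = [-2.0, 1.0, 1.0, 1.0, -2.0, 1.0, 1.0, 1.0]  # hypothetical returns
forecasts = [-1.0] * len(returns)                       # hypothetical VaR path
slope = lag_one_hit_slope(returns, forecasts, tau=0.25)
```

A slope clearly different from zero, as in this constructed example, indicates that the VaR model leaves exploitable information in the past hits.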

Definitions  of  the  Test  Processes  for  the  Empirical  Section 

In the empirical section we compute Wald, integrated Wald, and quasi-Score
test processes at a sequence of values of $p$ in order to test the
hypothesis:

$$H_0: \beta_1(\tau) = 0,$$

where $\beta(\tau) = (\beta_0(\tau), \beta_1(\tau)')'$, and $\beta_0(\tau)$ is
the intercept parameter. Hypothesis $H_0$ states that the coefficients on all
conditioning variables are zero. If $H_0$ is true, it means that conditioning
is statistically irrelevant.

To define these tests, denote the constrained quantile regression coefficient
as $\hat\beta_R(\tau) = \arg\inf_{\beta} Q_T(\beta, \tau)$ s.t.
$\beta_1(\tau) = 0$. Then define the Wald


12  Given the stochastic equicontinuity of the test processes, their
distribution can be approximated via finite subnets of the processes evaluated
at the grid of equi-distant quantile indices $\{\tau_i,\ i = 1, \dots, K\}$ for
$K$ sufficiently large. This means that only a finite number of evaluations
should be considered to approximate well, e.g.,
$\sup_{\tau \in V} S_c(\tau)$ via $\sup_{\tau_i} S_c(\tau_i)$. If the
approximation error is to vanish, we need $K \to \infty$ as $T \to \infty$.

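and Score test processes13:

$$W(\cdot) = T\,(\hat\beta_R(\cdot) - \hat\beta(\cdot))'\, \Omega_{1T}(\cdot)\, (\hat\beta_R(\cdot) - \hat\beta(\cdot)),$$

$$S(\cdot) = T\, \nabla_\beta Q_T(\hat\beta_R(\cdot), \cdot)'\, \Omega_{2T}(\cdot)\, \nabla_\beta Q_T(\hat\beta_R(\cdot), \cdot).$$

The specification test process, as a function of $\tau$, is defined as:

$$S_c(\cdot) = \mu_T(\cdot)'\, \Omega_{3T}(\cdot)\, \mu_T(\cdot), \quad \text{where} \quad \mu_T(\tau) = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} (1(y_t \le X_t'\hat\beta(\tau)) - \tau)\, Z_t.$$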

The  specification  test,  as  stated  earlier,  checks  the  validity  of  the  given 
functional  form,  as  well  as  the  conditioning  ability.14 

4.2  Near-Extreme Regression Quantiles (VaR)

While the regular asymptotics is certainly useful in a large sample for
non-extremal regression quantiles, a more cautious approach is needed to
distinguish a separate kind of inference for near-extreme (extremal) regression
quantiles, as developed in Chernozhukov (1999a). What follows summarizes
some of these results. Intuitively, the sample quantile regression is a method
of generalizing the notion of order statistics to regression settings.
Correspondingly, for any given sample size $T$, the $\tau$-th regression
quantile fit can be seen as the $\tau T$-th conditional order (or rank)
statistic. Depending on this rank, asymptotic approximations reflect the
extremal or rare data considerations. Indeed, consider a simple example with
no covariates. A .01-th quantile estimator in a sample of 100 is the first
order statistic, and

13  (a) How to choose $\Omega_{1T}$, $\Omega_{2T}$, and $\Omega_{3T}$? Each of
the test processes is of the form $w_T(\tau)'\, \Omega_T(\tau)\, w_T(\tau)$. For
any $\tau$, $\Omega_T(\tau)$ is chosen to be an inverse of a consistent
estimator of the asymptotic variance of $w_T(\tau)$. One can either exploit the
analytical expressions for these variances (as discussed in Chernozhukov
(1999b)) or, as employed in the empirical section, use a moving-block bootstrap
to estimate them. The procedure is quite simple: using such a form of
bootstrap, compute statistics $w_T^b(\tau)$ (for $b = 1, \dots, B$ denoting the
bootstrap replications, setting $B$ large). Then compute the variance matrix of
the "sample" $\{w_T^b(\tau),\ b = 1, \dots, B\}$. (This procedure is consistent
under the local alternatives, since then only the asymptotic mean of
$w_T(\tau)$ is shifted and the variance is unaffected.) (b) $\nabla_\beta Q_T$
means $\frac{1}{T} \sum_{t \in I} \rho_\tau'(Y_t - X_t'\hat\beta(\tau))$,
$I = \{t : Y_t \ne X_t'\hat\beta(\tau)\}$.

14  In the empirical section we give the specification test process for the
linear model with $Z_t$ selected to be the polynomial forms of $X_t$. That is,
we have devoted our attention to the first problem. However, one can check the
conditionality by selecting the appropriate $Z_t$'s, as discussed.

hence one would expect that the regular asymptotic approximation does not apply here, but some other asymptotic theory does. Similar considerations apply to the regression setting. Indeed, data scarcity is amplified by the presence of covariates. To that end, the concept of effective rank is useful. The effective rank, $r$, is the ratio of the rank $k = \tau T$ (if $\tau < 1/2$, and $k = (1 - \tau)T$ if $\tau > 1/2$) to the number of regressors $d$, that is, $r = k/d$. To motivate such a notion, consider the "regression" quantile problem in a sample of 1000 observations and 10 dummy regressors, in which the target is the .01-th conditional quantile function. Here $k = .01 \times 1000 = 10$ and $r = 10/10 = 1$. The constructed estimates of slopes will be the first (lowest) order statistics in each of the 10 subsamples corresponding to the dummy variables. Hence an extremal situation still applies here.
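
As a small illustration, the effective rank of the dummy-regressor example can be computed directly. The function name `effective_rank` is hypothetical; the formula is the one stated in the text.

```python
def effective_rank(tau, T, d):
    """Effective rank r = k/d, where k = tau*T in the lower tail and (1 - tau)*T in the upper."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be a positive number of regressors")
    k = min(tau, 1.0 - tau) * T   # rank of the target quantile, counted from the nearer tail
    return k / d


# Example from the text: T = 1000 observations, d = 10 dummy regressors, tau = .01.
r_example = effective_rank(0.01, 1000, 10)
```

With these inputs the effective rank is about 1, far below the sample size of 1000, which is why the problem behaves like an extreme order statistic problem.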

Let us now distinguish two formal asymptotic considerations/statistical experiments that account for the data scarcity illustrated above. (In the sequel, we always assume $\tau < 1/2$. Replace $\tau$ by $(1 - \tau)$ if $\tau > 1/2$):

(i) $\tau \to 0$, $\tau T \to \infty$ (intermediate rank behavior, or $\tau$ is relatively low as compared to the sample size),

(ii) $\tau \to 0$, $\tau T \to k$ (extreme rank behavior, or $\tau$ is very low as compared to the sample size).

Both (i) and (ii) intend to asymptotically capture forms of data scarcity arising in the tails of the distribution. Both can be applied for any given quantile index of interest in a fixed dataset, and these notions give rise to alternative inference techniques that can be applied when dealing with the extremal regression quantiles (VaR). Intuitively, these alternative inference approaches should perform better for the near-extreme regression quantiles than for the central ones. In the empirical section, these alternative techniques are used to construct the confidence intervals of the regression quantile coefficients. A "rule-of-thumb" choice of the (more) appropriate approximation theory for inference purposes is as follows: if $r < 10$, one may select method (ii) (as illustrated in the example above); if $r \in (10, 25)$, method (i); and if $r > 25$, the regular or central asymptotic approximations (previous subsection).
A  brief  description  of  the  asymptotics  under  (i)  and  (ii)  follows  next. 
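
The rule of thumb can be written as a small selector. This is a sketch only: the function name and the returned labels are hypothetical, and the treatment of the boundary values $r = 10$ and $r = 25$ (which the rule leaves open) is an assumption made here.

```python
def choose_asymptotics(tau, T, d):
    """Pick an approximation theory from the effective rank r = min(tau, 1 - tau) * T / d."""
    r = min(tau, 1.0 - tau) * T / d
    if r < 10:
        return "extreme rank (ii)"
    if r < 25:                      # assumption: r = 10 is assigned to method (i)
        return "intermediate rank (i)"
    return "central"                # assumption: r = 25 is assigned to the central theory


# With T = 2527 daily observations and d = 4 regressors (constant plus three lagged returns):
for tau in (0.01, 0.02, 0.05):
    print(tau, choose_asymptotics(tau, 2527, 4))
```

For the sample used in the empirical section, this places $\tau = .01$ in the extreme regime, $\tau = .02$ in the intermediate regime, and $\tau = .05$ in the central regime.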

Assume that the conditional distribution function of $y_t - X_t'\beta(q)$ ($q = 1/2$ or $0$) is tail-equivalent to some function $K(x)F_u(\cdot)$,15 where $F_u$ is a distribution function with the extremal tail types.


15 We say that cdf $F_1$ is tail-equivalent to cdf $F_2$ at (say the lower) end-point $x$, if as $z \searrow x$, $F_1(z)/F_2(z) \to 1$.


14  Victor  Chernozhukov,  Len  Umantsev 

In the case (i), the asymptotic distributions are normal, but the covariance matrix depends on the tail index: (for any $m > 0$, $m \neq 1$)

where $Q_X = \lim_T \frac{1}{T}\sum_{t=1}^{T} E X_t X_t'$, $Q_H = \lim_T \frac{1}{T}\sum_{t=1}^{T} E X_t X_t'/H(X_t)$, $\xi$ is the tail index of $F_u$, $H(x)$ is some function of $x$ that depends on the tail type of $F_u$, and $\mu_X = \lim_T \frac{1}{T}\sum_{t=1}^{T} E X_t$. Furthermore, $\mu_X'(\beta(m\tau) - \beta(\tau))$ can be replaced by $\bar{X}'(\hat\beta(m\tau) - \hat\beta(\tau))$, without affecting the validity of the result. We do not state the explicit forms here since the result is used only indirectly to justify the resampling techniques (that will be used to construct the confidence intervals in the empirical section - see the next section).

In the case (ii), the asymptotic distributions are defined by a random variable that solves a stochastic optimization problem, where the objective function is an integral w.r.t. a Poisson point process:

$$a_T\left(\hat\beta(\tau) - \beta(\tau)\right) \xrightarrow{d} c(k) + \arg\inf_{z} \left[ -k\,\mu_X' z + \int (j - x'z)^{-}\, dN(j, x) \right],$$

where $N$ is a certain Poisson point process. The mean intensity function of $N$, the constants $c(k)$, and the scaling $a_T$ depend on the underlying tail type of $F_u$ and on the tail heterogeneity function $K(x)$. Again, we do not state the explicit forms here since the result is used only indirectly to justify the resampling techniques (that will be used to construct the confidence intervals in the empirical section - see the next section).

Furthermore, Chernozhukov (1999a) defines and studies inference processes analogous to those in the previous section16 and shows how to conduct inference by asymptotic or resampling methods. In particular, estimates of the tail index $\xi$,17 the tail heterogeneity function $K(x)$, and the scaling constants $a_T$

16 Construction of quantile and inference processes is done by introducing an index $l$ in a set $[l_1, l_2]$, so that the process $a_T(\hat\beta(l\tau) - \beta(l\tau))$ is a function of $l$, etc.

17 A simple rule-of-thumb estimator for the empirical section is deduced from the following relationship: $\frac{\bar{X}'(\hat\beta(m\tau) - \hat\beta(\tau))}{\bar{X}'(\hat\beta(\tau) - \hat\beta(\tau m^{-1}))} \Big/ m^{-\xi} \to 1$ as $\tau T \to \infty$, $\tau \searrow 0$ (but more sophisticated estimators can be constructed; see Chernozhukov (1999a)). So that

$$\hat\xi(m, \tau) = -\frac{1}{\ln m}\, \ln \frac{\bar{X}'(\hat\beta(m\tau) - \hat\beta(\tau))}{\bar{X}'(\hat\beta(\tau) - \hat\beta(\tau m^{-1}))}.$$

In practice $m$ should not be set too far away from 1. E.g. in the empirical section, we used $m = .75$, $m = 1.25$, and various values of $\tau$ s.t. $r$, the effective rank, is between 10 and 20. We then took the median of $\{\hat\xi(m_i, \tau_j)\}$ over all such values of $m_i$ and $\tau_j$ to obtain the final estimate $\hat\xi$.
are offered, and the validity of subsampling is established. Alternative regression quantile estimators emerge from these results. For example, a regression quantile extrapolation estimator18 is constructed as follows (for $\tau_e$ close to zero and $\tau$ not close, and any positive constant $m \neq 1$):

$$\hat F_y^{-1}(\tau_e \mid x) = \frac{(\tau/\tau_e)^{\hat\xi} - 1}{m^{-\hat\xi} - 1}\; x'\left(\hat\beta(m\tau) - \hat\beta(\tau)\right) + x'\hat\beta(\tau).$$

This also implicitly defines the extrapolation quantile estimates for $\beta(\tau_e)$.
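
The rule-of-thumb tail index estimator and the extrapolation estimator can be checked numerically in a stripped-down setting. The sketch below makes simplifying assumptions: there are no regressors, so a quantile function `q(tau)` stands in for $x'\hat\beta(\tau)$, and the lower tail is taken to be exactly algebraic with tail index .25. All function names are hypothetical.

```python
import math
import statistics


def tail_index(q, m, tau):
    """Rule-of-thumb tail index from the quantiles q(m*tau), q(tau), q(tau/m)."""
    ratio = (q(m * tau) - q(tau)) / (q(tau) - q(tau / m))
    return -math.log(ratio) / math.log(m)


def median_tail_index(q, ms, taus):
    """Median of the rule-of-thumb estimates over a grid of (m, tau) pairs."""
    return statistics.median(tail_index(q, m, tau) for m in ms for tau in taus)


def extrapolate(q, tau, tau_e, m, xi):
    """Extrapolate from an intermediate index tau to an extreme index tau_e."""
    weight = ((tau / tau_e) ** xi - 1.0) / (m ** (-xi) - 1.0)
    return weight * (q(m * tau) - q(tau)) + q(tau)


def pareto_q(tau):
    """Hypothetical lower-tail quantile function, exactly algebraic with index .25."""
    return -tau ** (-0.25)


xi_hat = median_tail_index(pareto_q, ms=[0.75, 1.25], taus=[0.03, 0.04, 0.05])
q_extreme = extrapolate(pareto_q, tau=0.05, tau_e=0.001, m=0.75, xi=xi_hat)
```

Because the tail here is exactly algebraic, `xi_hat` recovers .25 and `q_extreme` coincides with `pareto_q(0.001)` up to rounding. With estimated regression quantiles both would hold only approximately.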

5  Empirical  Analysis 

This section considers estimating the VaR of the Occidental Petroleum (NYSE:OXY) security returns. The dataset consists of 2527 daily observations19 on

— $y_t$, the one-day returns,

— $X_t$, a vector of returns (or prices, yields, etc.) of other securities that affect the distribution of $y_t$ and/or lagged values of $y_t$ itself: a constant, the lagged one-day return of Dow Jones Industrials (DJI), the lagged return on the spot price of oil (NCL, front-month contract on crude oil on NYMEX), and the lagged return $y_{t-1}$.

Generally,  to  estimate  the  VaR  of  a  stock  return,  Xt  may  contain  such 
variables  as  a  market  index  of  corresponding  capitalization  and  type  (for 
instance,  the  S&P500  Value  for  a  large-cap  value  stock),  the  industry  index, 
a  price  of  commodity  or  some  other  traded  risk  that  the  firm  is  exposed  to, 
and lagged values of its stock price. It is also conceivable to include some unobserved factors, such as Size, Value, Momentum, or Liquidity premiums, whose effect on stock returns and risk has been a subject of numerous studies. However, we chose not to include estimated variables in the information set for the sake of simplicity.

Functional Forms of Conditional Quantile Functions
Two functional forms of conditional VaR were estimated:

• Linear Model: $v_t(p) = X_t'\theta(p)$,

• Quadratic Model: $v_t(p) = X_t'\theta(p) + X_t' B(p) X_t$.
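
To make the two functional forms concrete, the following sketch evaluates them at a fixed probability level. The coefficient values and regressor values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def linear_var(x, theta):
    """Linear model: v = x' theta."""
    return sum(xi * ti for xi, ti in zip(x, theta))


def quadratic_var(x, theta, B):
    """Quadratic model: v = x' theta + x' B x."""
    d = len(x)
    quad = sum(x[i] * B[i][j] * x[j] for i in range(d) for j in range(d))
    return linear_var(x, theta) + quad


# Hypothetical inputs: a constant plus three lagged returns, at one probability level p.
x = [1.0, 0.5, -1.0, 2.0]
theta = [-2.0, 0.4, 0.3, -0.1]
B = [[0.0] * 4 for _ in range(4)]
B[1][1] = 1.0                       # a single squared term on the second regressor

v_linear = linear_var(x, theta)
v_quadratic = quadratic_var(x, theta, B)
```

Setting $B(p)$ to zero collapses the quadratic model to the linear one, so the linear model is nested in the quadratic specification.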


18  This  is  a  direct  regression  analogue  of  the  estimator  of  Dekkers  and  de  Haan 
(1989)  that  was  suggested  for  non-regression  cases. 

19 From September 1986 to November 1998
Conditional Risk Surfaces

Figure 1 presents surfaces of the regression VaR functions plotted in the time-probability level coordinates, $(t, p)$. Recall that $p$ is called the probability level of VaR, and $\tau = 1 - p$ is the quantile index. We report VaR for all values of $p \in [.01, .99]$. The conventional VaR reporting typically involves the probability levels of $p = .99$ and $p = .95$. Clearly, the whole VaR surface formed by varying $p$ in $[.01, .99]$ represents a more complete depiction of conditional risk. Note also that since one can be either long or short the security, estimation of VaR in both tails of the return distribution is of interest.
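
The mapping between the probability level $p$ and the quantile index $\tau = 1 - p$ can be illustrated on a grid. The sketch below is a simplification: it uses unconditional empirical quantiles of a sample rather than regression quantiles conditional on $X_t$, and the function names are hypothetical.

```python
import math


def empirical_quantile(sample, tau):
    """Left-continuous empirical quantile: the ceil(tau * n)-th order statistic."""
    ordered = sorted(sample)
    n = len(ordered)
    # round() guards against floating-point noise, e.g. 1 - 0.99 = 0.010000000000000009
    rank = max(1, math.ceil(round(tau * n, 9)))
    return ordered[rank - 1]


def var_curve(sample, step=0.01):
    """VaR(p) = quantile at tau = 1 - p, for p on the grid step, 2*step, ..., 1 - step."""
    n_points = round(1 / step) - 1
    grid = [round(step * i, 10) for i in range(1, n_points + 1)]
    return {p: empirical_quantile(sample, 1.0 - p) for p in grid}


# Hypothetical sample: the integers 1..100 standing in for returns.
curve = var_curve(list(range(1, 101)))
```

High probability levels such as $p = .99$ pick out the lower tail of returns (relevant for long positions), while low levels such as $p = .01$ pick out the upper tail (relevant for short positions).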

The  dynamics  depicted  in  figure  1  unambiguously  indicate  certain  dates 
on  which  market  risk  tends  to  be  much  higher  than  its  usual  level.  This  by 
itself  underscores  the  importance  of  conditional  modeling.  We  also  stress 
that  the  driving  force  behind  the  dynamics  is  the  behavior  of  Xt. 

Model  Comparison 

Figure  1  also  compares  the  dynamic  evolution  of  the  linear  and  the 
quadratic  VaR  surfaces.  Notably,  the  quadratic  model  predicts  higher  risk 
magnitudes  than  the  linear  model.  Indeed,  the  fluctuations  of  the  quadratic 
VaR  surface  are  significantly  larger.  The  linear  model  thus  predicts  a  more 
"smoothed  out"  VaR  surface. 

Conditional  Quantile  and  Quantile  Coefficient  Functions 

The  next  series  of  figures  presents  the  statistical  aspects  of  the  analysis. 
For  brevity,  we  chose  to  present  the  results  in  a  graphical  form.20 

Let us set the date at $t = 2500$ to analyze the VaR. Figure 2 depicts the estimated $\mathrm{VaR}_{2500}(p)$ for values of $p$ in the interval [.01, .99].21 This figure also shows the 95% confidence intervals (c.i.) obtained by the following procedures:22 (1) regular inference, based on the asymptotic normal approximation (labeled as "asymptotic"), (2) resampling inference, by the stationary bootstrap, that is valid under regular and intermediate rank asymptotics, and (3) and (4): subsampling inference with different scaling schemes, denoted as "Subsampling I" and "Subsampling II," suited for dependent data, and valid under the extreme rank asymptotics. Method (1) is intended to


20  We  have  not  presented  here  the  formal  statistical  analysis  of  the  quadratic 
model  for  brevity.  Umantsev  and  Chernozhukov  (1999)  offer  a  detailed  analysis  of 
the  quadratic  model. 

21 We computed $\mathrm{VaR}(\cdot)$ and coefficients for values of $p$ lying on a grid with cell size .01 and interpolated in between. This is a justifiable interpolation since $\mathrm{VaR}(\cdot)$ and coefficient processes are stochastically equicontinuous.

22  All  methods  are  in  a  form  that  is  suitable  for  dependent  data. 
give the confidence intervals that are best for the central values $p \in [.1, .9]$, method (2) for the intermediate (near-extreme) values, $p \in [.04, .96]$, and methods (3) and (4) for the extreme values $p \in (0, .04]$ and $[.96, 1)$.24

As  can  be  seen  from  figure  2,  the  c.i.  by  methods  (2),  (3),  and  (4)  tend 
to  be  roughly  1.5,  2,  and  2.5  times  wider  than  the  standard  c.i.,  respectively. 
Hence  additional  significant  estimation  uncertainty  is  present  in  the  tails, 
and  it  is  important  to  properly  account  for  it.  Accounting  for  it  means  that, 
within the c.i. by methods (2)-(4), near-extreme VaR may actually be as
much  as  two  times  higher  than  the  point  estimates  suggest. 

Figures 4-5 present the same analysis for the coefficient functions $\theta(\cdot)$ of the linear model. The methods employed and the results are like those we have just discussed. We will give an economic meaning to the coefficient shapes later.

Specification  Analysis 

Figure  6  presents  the  pointwise  values  of  Wald  and  quasi-Score  test 
statistics  for  testing  the  hypothesis: 

—  Is  the  conventional  (unconditional  historical)  VaR  model  statistically  not 
different  from  the  conditional  VaR  model? 

The answer is conclusive: the hypothesis is rejected pointwise. Note that the 'p-values'25 for this case are all smaller than 0.01. That is, the regression conditioning matters. This can also be seen in figure 4, where the confidence intervals of slope coefficients are plotted throughout the interesting range of $p$: these confidence intervals exclude 0.

Figure  6  (right)  also  depicts  the  specification  test  process  (see  earlier 
section).  Results  of  the  specification  testing  are  clearly  in  favor  of  the  linear 
model:  the  critical  value  (pointwise)  for  10%  level  is  6.25,  which  is  above 


23 Based on Monte-Carlo with the sample size of 1000 and the considerations of the previous section.

24 In this application, for transparency and clarity, the subsampling methods were operationalized by assuming the tails are exactly algebraic, so that the rate of convergence or divergence is $a_T = T^{-\xi}$. $\xi$ was estimated to be approximately .25 by the method described in the previous section. Hence $a_T = T^{-.25}$ defined a scaling for the subsampling procedure. As suggested in Chernozhukov (1999a), the centering constant was taken to be $\hat\beta_T(k/b)$. The subsample size $b$ was set
to  be  1/10  of  the  whole  sample  T.  The  resulting  confidence  intervals  are  labeled 
"Subsampling  II."  For  comparison,  rate  aT  =  T01  was  also  used,  and  the  resulting 
confidence  intervals  were  labeled  as  "Subsampling  I." 
25   Not  to  be  confused  with  p  in  VaR(p). 
any of the values depicted. Obviously, since the critical value for the test statistic $\sup_{\tau \in [.01, .99]} S_c(\tau)$ should be above 6.25, the linear model passes this stronger Kolmogorov-type test, too!

The  Determinants  of  Risk 

We now provide both a statistical and an economic interpretation of the coefficient functions $\theta_i(\cdot)$. Let us fix time period $t = 2500$ and suppose for a moment that $\theta_i(p) > 0$ for some $i > 0$, $p \in (0, 1)$. As $\mathrm{VaR}_t(p) = v_t(p) = \theta_0(p) + \sum_i \theta_i(p) X_{t,i}$, a positive coefficient in front of $X_{t,i}$ implies that higher values of $X_{t,i}$ correspond to higher values of $\mathrm{VaR}_t(p)$, given that other elements of $X_t$ are unchanged. Stated differently, if $\theta_i(p) > 0$, then increases (decreases) of $X_{t,i}$ are associated with upward (downward) shifts of $\mathrm{VaR}_t(\cdot)$ at point $p$. Note that $\mathrm{VaR}_t(\cdot)$ is the "reversed" inverse of the cdf of $y_t \mid X_t$, i.e. $F_{y_t}^{-1}(1 - \cdot \mid X_t)$ [take figure 3 (middle) and rotate it 90° clockwise to get the conditional cdf]. Thus positive shocks in $X_{t,i}$ shift the cdf $F_{y_t}(\cdot \mid X_t)$ to the right.

Similarly, if $\theta_i(p)$ is negative, effects of positive and negative shocks in the $i$th information variable are reversed: positive shocks move $\mathrm{VaR}_t(p)$ down and the cdf of $y_t \mid X_t$ to the left, and negative shocks move $\mathrm{VaR}_t(p)$ up and the cdf of $y_t \mid X_t$ to the right.

The effects described above are local, in the sense that they affect $\mathrm{VaR}_t(\cdot)$ and $F_{y_t}(\cdot \mid X_t)$ only locally, around points $p$ and $F_{y_t}^{-1}(1 - p \mid X_t)$, respectively. Transformations of these functions at other points caused by such shocks depend on the sign and magnitude of $\theta_i(p)$ at other probability levels $p$.

Suppose next that $\theta_i$ is positive and decreasing in the right tail of the distribution of $y_t \mid X_t$ (e.g. $\theta_1(\cdot)$ on $(0, .2)$, see figure 4). A positive shock in $X_i$ will now shift the entire right tail of the cdf of $y_t \mid X_t$ to the right, and the effect will be greater for extreme points (those close to $p = 0$), at which $\theta_i(p)$ is higher. The effect on the density of $y_t \mid X_t$ is schematically depicted in figure 7. Thus, this particular shape of $\theta_i(\cdot)$ implies that positive shocks of the corresponding information variable result in the right tail of the density of $y_t \mid X_t$ being stretched further to the right (more positive skewness in the right tail). A similar shape is observed for the coefficient function $\theta_2(p)$ for almost all values of $p$.
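
A small numerical sketch of this tail-stretching effect under the linear model follows. The coefficient function `theta_1` below is hypothetical, chosen only to be positive and decreasing on $(0, .2)$; it is not the estimated function from figure 4.

```python
def theta_1(p):
    """Hypothetical coefficient function: positive and decreasing on (0, 0.2)."""
    return 0.5 - 2.0 * p


def var_shift(theta, p, shock):
    """Change in VaR_t(p) under the linear model when one regressor moves by `shock`."""
    return theta(p) * shock


# Shift of the right-tail quantiles after a unit positive shock in the regressor.
shifts = {p: var_shift(theta_1, p, shock=1.0) for p in (0.01, 0.05, 0.10, 0.19)}
```

The shifts are all positive and largest at the most extreme probability level, so the right tail moves right and spreads out at the same time.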

Thus, the shapes of $\theta_i(\cdot)$ such as $\theta_1(\cdot)$ or $\theta_2(\cdot)$ in figure 4 translate positive shocks of the corresponding information variables into longer right tails (favorable for holding long positions in $Y$). On the other hand, shapes of $\theta_i(\cdot)$ similar to those of $\theta_3(\cdot)$ in figure 4 translate such shocks into shorter right tails (an adverse effect for long holders of $Y$; see figure 7).
Finally, we provide the economic interpretation of the slope coefficient functions $\theta_1(\cdot)$, $\theta_2(\cdot)$, $\theta_3(\cdot)$, corresponding to the lagged returns on the oil spot price, $X_1$, the equity index, $X_2$, and the price of the security in question, $X_3$.


— $\theta_1(\cdot)$ is significantly positive and decreasing in the right tail of the distribution of $y_t \mid X_t$ (figures 4 and 5, $p < 1/4$). It is insignificantly positive in the middle part (for $p \in (.25, .95)$) and then it is increasing in the far left tail ($p > .96$), although the values $\theta_1(p)$, $p > .96$, are not as high as those for $p < 1/4$. This suggests that the spot price of oil and the return on our stock are positively related, with the right tail of the equity return being much more sensitive to oil price shocks than the left tail. This effect can be explained by, for example, real optionality intrinsic to the operation of the firm, or by a non-linear hedging policy (e.g., long positions in put options instead of swaps or futures, whose payoff is linear in the underlying price movements). The overall effect of a positive shock in the spot oil price $X_1$ is presented in figure 7.

— $\theta_2(\cdot)$, in contrast, is significantly positive for all values of $p$ with the possible exception of the far right tail of $y_t \mid X_t$ ($p \in (0, 0.04)$). We also notice a moderate increase in the right tail and a sharp increase in the left tail ($p$ close to 1). Thus, in addition to the strong positive relation between the return on the individual stock and the market return (DJI) (dictated by the fact that $\theta_2(\cdot) > 0$ on $(0, 1)$) there is also additional sensitivity of the left tail of the security return to the market movements (steep increase on $(.94, 1)$), which is strongly consistent with the notion of highly correlated equity returns in market drops. For high positive returns, in contrast, the market return has a much weaker effect (low values on $(0, 0.04)$). The effect of a positive shock in the market return ($X_2$) is depicted in figure 7.

— $\theta_3(\cdot)$, in contrast, is significantly negative, except for values of $p$ close to 0. This may be interpreted as a "mean reversion" effect in the central part of the distribution. However, $X_3$, the lagged return, does not appear to significantly shift the quantile function in the tails. Thus $X_3$ is more important for the determination of intermediate risks (values of $p$ in $[.15, .85]$). The effect of a positive shock in $X_3$ is schematically portrayed in figure 7. Figures 4 and 7 also capture the asymmetry of response to the negative and positive return shocks: a positive shock leads to mean reversion and intermediate risk contraction, whereas a negative shock leads to mean reversion and intermediate risk amplification.
Note that the estimates of near-extreme VaR should be interpreted carefully, since the point estimates provided by regression quantiles are highly biased in the tails. Some correction can be achieved by using alternative estimators that use the regular variation properties of tails in order to construct regression-like estimates of the near-extreme VaR (see previous section). For example, as depicted in Figure 3, the regression extrapolation estimator introduces a significant correction, but within the confidence bands constructed by method (4). Further conclusions are stated in section 1.
Fig. 1 $\mathrm{VaR}_t(p)$ for Linear (upper) and Quadratic (lower) Models ($\mathrm{VaR}_t(p)$ is on the vertical axis, and $(p, t)$ are on the horizontal axes.)
Fig. 4 Estimates and 95% (Pointwise) Confidence Intervals for $\theta_0(\cdot)$,